 contrastive search


The Truncation Blind Spot: How Decoding Strategies Systematically Exclude Human-Like Token Choices

Arias, Esteban Garces, Sapargali, Nurzhan, Heumann, Christian, Aßenmacher, Matthias

arXiv.org Machine Learning

Standard decoding strategies for text generation, including top-k, nucleus sampling, and contrastive search, select tokens based on likelihood, restricting selection to high-probability regions. Human language production operates differently: tokens are chosen for communicative appropriateness rather than statistical frequency. This mismatch creates a truncation blind spot: contextually appropriate but statistically rare tokens remain accessible to humans yet unreachable by likelihood-based decoding. We hypothesize this contributes to the detectability of machine-generated text. Analyzing over 1.8 million texts across eight language models, five decoding strategies, and 53 hyperparameter configurations, we find that 8-18% of human-selected tokens fall outside typical truncation boundaries. Simple classifiers trained on predictability and lexical diversity achieve high detection accuracy. Crucially, neither model scale nor architecture correlates strongly with detectability; truncation parameters account for most variance. Configurations achieving low detectability often produce incoherent text, indicating that evading detection and producing natural text are distinct objectives. These findings suggest that detectability stems from likelihood-based token selection rather than from model capability alone.
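The truncation blind spot described in the abstract can be illustrated with a minimal nucleus (top-p) sketch. The toy distribution, the threshold p=0.90, and the "human choice" index below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def nucleus_set(probs, p=0.95):
    """Return the indices of the smallest set of tokens whose
    cumulative probability exceeds p (nucleus / top-p truncation)."""
    order = np.argsort(probs)[::-1]       # tokens sorted by probability, descending
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1  # smallest prefix covering mass p
    return set(int(i) for i in order[:cutoff])

# Toy next-token distribution over a 10-token vocabulary.
probs = np.array([0.40, 0.25, 0.15, 0.08, 0.05,
                  0.03, 0.02, 0.01, 0.006, 0.004])

kept = nucleus_set(probs, p=0.90)
human_choice = 7  # a contextually apt but statistically rare token (hypothetical)
print(sorted(kept))          # tokens reachable by nucleus sampling: [0, 1, 2, 3, 4]
print(human_choice in kept)  # False: the "blind spot" in action
```

Any token outside `kept` has zero probability under nucleus sampling at this threshold, which is exactly why rare-but-appropriate human choices become a detectable signal.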




Context-Enhanced Contrastive Search for Improved LLM Text Generation

Sen, Jaydip, Pandey, Rohit, Waghela, Hetvi

arXiv.org Artificial Intelligence

Recently, Large Language Models (LLMs) have demonstrated remarkable advancements in Natural Language Processing (NLP). However, generating high-quality text that balances coherence, diversity, and relevance remains challenging. Traditional decoding methods, such as beam search and top-k sampling, often struggle with either repetitive or incoherent outputs, particularly in tasks that require long-form text generation. To address these limitations, the paper proposes a novel enhancement of the well-known Contrastive Search algorithm, Context-Enhanced Contrastive Search (CECS) with contextual calibration. The proposed scheme introduces several novelties, including dynamic contextual importance weighting, multi-level Contrastive Search, and adaptive temperature control, to optimize the balance between fluency, creativity, and precision. The performance of CECS is evaluated using several standard metrics such as BLEU, ROUGE, and semantic similarity. Experimental results demonstrate significant improvements in both coherence and relevance of the texts generated by CECS, outperforming existing Contrastive Search techniques. The proposed algorithm has several potential real-world applications, including legal document drafting, customer service chatbots, and content marketing.

In recent years, Large Language Models (LLMs) have transformed the field of Natural Language Processing (NLP), delivering cutting-edge performance across numerous tasks, including text generation, summarization, machine translation, and question answering. Models such as OpenAI's GPT-3 [1], Google's BERT [2], and more recently PaLM [3], have greatly enhanced the capabilities of machines in understanding and generating human language. By leveraging deep neural network architectures and training on extensive datasets, LLMs have made significant strides in producing fluent and coherent text that closely resembles human communication.
Generating text from an LLM involves more than simply predicting the next word in a sequence according to its probability distribution. This step, known as decoding, plays a critical role in shaping the final output. Various decoding strategies have been proposed in the literature, ranging from deterministic methods such as beam search to stochastic methods like top-k and nucleus sampling. While deterministic methods choose the highest-probability token at each step, their stochastic counterparts introduce randomness to improve diversity in the generated output.
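The deterministic/stochastic contrast above can be sketched over a toy distribution (the distribution and k below are illustrative, not drawn from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def greedy(probs):
    """Deterministic decoding: always pick the single most likely token."""
    return int(np.argmax(probs))

def top_k_sample(probs, k=3):
    """Stochastic decoding: renormalize over the k most likely tokens
    and sample, trading a little likelihood for diversity."""
    top = np.argsort(probs)[::-1][:k]
    renorm = probs[top] / probs[top].sum()
    return int(rng.choice(top, p=renorm))

# Toy next-token distribution over a 5-token vocabulary.
probs = np.array([0.5, 0.2, 0.15, 0.1, 0.05])

print(greedy(probs))        # always 0
print(top_k_sample(probs))  # 0, 1, or 2, chosen at random
```

Greedy decoding returns the same token every time, while top-k sampling can return any of the k retained tokens; tokens outside the top k are unreachable, which is the truncation behavior the other abstracts on this page revisit.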


Decoding Decoded: Understanding Hyperparameter Effects in Open-Ended Text Generation

Arias, Esteban Garces, Li, Meimingwei, Heumann, Christian, Aßenmacher, Matthias

arXiv.org Artificial Intelligence

Decoding strategies for generative large language models (LLMs) are a critical but often underexplored aspect of text generation tasks. Guided by specific hyperparameters, these strategies aim to transform the raw probability distributions produced by language models into coherent, fluent text. In this study, we undertake a large-scale empirical assessment of a range of decoding methods, open-source LLMs, textual domains, and evaluation protocols to determine how hyperparameter choices shape the outputs. Our experiments include both factual (e.g., news) and creative (e.g., fiction) domains, and incorporate a broad suite of automatic evaluation metrics alongside human judgments. Through extensive sensitivity analyses, we distill practical recommendations for selecting and tuning hyperparameters, noting that optimal configurations vary across models and tasks. By synthesizing these insights, this study provides actionable guidance for refining decoding strategies, enabling researchers and practitioners to achieve higher-quality, more reliable, and context-appropriate text generation outcomes.


Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation

Arias, Esteban Garces, Rodemann, Julian, Li, Meimingwei, Heumann, Christian, Aßenmacher, Matthias

arXiv.org Machine Learning

Decoding from the output distributions of large language models to produce high-quality text is a complex challenge in language modeling. Various approaches, such as beam search, sampling with temperature, top-$k$ sampling, nucleus (top-$p$) sampling, typical decoding, contrastive decoding, and contrastive search, have been proposed to address this problem, aiming to improve coherence and diversity, as well as resemblance to human-generated text. In this study, we introduce adaptive contrastive search, a novel decoding strategy extending contrastive search by incorporating an adaptive degeneration penalty, guided by the estimated uncertainty of the model at each generation step. This strategy is designed to enhance both the creativity and diversity of the language modeling process while at the same time producing coherent and high-quality generated text output. Our findings indicate performance enhancement in both aspects, across different model architectures and datasets, underscoring the effectiveness of our method in text generation tasks. Our code base, datasets, and models are publicly available.
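A simplified sketch of the contrastive search objective, and of the adaptive idea, may help. The standard score trades model confidence against a degeneration penalty (maximum similarity to context representations); the `adaptive_alpha` function below uses normalized entropy as an uncertainty proxy, which is our illustrative assumption and not necessarily the paper's exact estimator:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def contrastive_score(probs, cand_states, ctx_states, alpha):
    """Contrastive search: pick the candidate maximizing model confidence
    minus a degeneration penalty (max similarity to context states)."""
    scores = []
    for p, h in zip(probs, cand_states):
        penalty = max(cosine(h, c) for c in ctx_states)
        scores.append((1 - alpha) * p - alpha * penalty)
    return int(np.argmax(scores))

def adaptive_alpha(probs, alpha_max=0.6):
    """Sketch of the adaptive variant: weight the penalty by the
    normalized entropy of the next-token distribution (an assumed
    uncertainty proxy; the paper's estimator may differ)."""
    probs = np.asarray(probs)
    ent = -np.sum(probs * np.log(probs + 1e-12))
    return alpha_max * ent / np.log(len(probs))

# Two candidates: the likelier one duplicates the context representation.
probs = [0.6, 0.4]
ctx = [np.array([1.0, 0.0])]
cands = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

pick = contrastive_score(probs, cands, ctx, alpha=0.5)
print(pick)  # 1: the less likely but non-repetitive token wins
```

With a fixed alpha the penalty is applied uniformly; making alpha track uncertainty lets the penalty relax when the model is confident and tighten when it is not.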


Duwak: Dual Watermarks in Large Language Models

Zhu, Chaoyi, Galjaard, Jeroen, Chen, Pin-Yu, Chen, Lydia Y.

arXiv.org Artificial Intelligence

As large language models (LLMs) are increasingly used for text generation tasks, it is critical to audit their usage, govern their applications, and mitigate their potential harms. Existing watermark techniques are shown effective in embedding single human-imperceptible and machine-detectable patterns without significantly affecting generated text quality and semantics. However, the efficiency in detecting watermarks, i.e., the minimum number of tokens required to assert detection with significance and robustness against post-editing, is still debatable. In this paper, we propose Duwak, which fundamentally enhances the efficiency and quality of watermarking by embedding dual secret patterns in both the token probability distribution and the sampling scheme. To mitigate expression degradation caused by biasing toward certain tokens, we design a contrastive search to watermark the sampling scheme, which minimizes token repetition and enhances diversity. We theoretically explain the interdependency of the two watermarks within Duwak. We evaluate Duwak extensively on Llama2 under various post-editing attacks, against four state-of-the-art watermarking techniques and combinations of them. Our results show that Duwak-marked text achieves the highest watermarked text quality at the lowest required token count for detection, up to 70% fewer tokens than existing approaches, especially under paraphrasing attacks.


Fine-grained Conversational Decoding via Isotropic and Proximal Search

Yao, Yuxuan, Wu, Han, Xu, Qiling, Song, Linqi

arXiv.org Artificial Intelligence

General-purpose text decoding approaches are usually adopted for dialogue response generation. Although the quality of the generated responses can be improved with dialogue-specific encoding methods, conversational decoding methods are still under-explored. Inspired by the finding of \citet{wu2023learning} that a good dialogue feature space should follow the rules of locality and isotropy, we present a fine-grained conversational decoding method, termed \textit{isotropic and proximal search (IPS)}. Our method is designed to generate semantically concentrated responses while still maintaining informativeness and discrimination against the context. Experiments show that our approach outperforms existing decoding strategies in the dialogue field across both automatic and human evaluation metrics. More in-depth analyses further confirm the effectiveness of our approach.


Fidelity-Enriched Contrastive Search: Reconciling the Faithfulness-Diversity Trade-Off in Text Generation

Chen, Wei-Lin, Wu, Cheng-Kuang, Chen, Hsin-Hsi, Chen, Chung-Chi

arXiv.org Artificial Intelligence

In this paper, we address the hallucination problem commonly found in natural language generation tasks. Language models often generate fluent and convincing content but can lack consistency with the provided source, resulting in potential inaccuracies. We propose a new decoding method called Fidelity-Enriched Contrastive Search (FECS), which augments the contrastive search framework with context-aware regularization terms. FECS promotes tokens that are semantically similar to the provided source while penalizing repetitiveness in the generated text. We demonstrate its effectiveness across two tasks prone to hallucination: abstractive summarization and dialogue generation. Results show that FECS consistently enhances faithfulness across various language model sizes while maintaining output diversity comparable to well-performing decoding algorithms.


Weigh Your Own Words: Improving Hate Speech Counter Narrative Generation via Attention Regularization

Bonaldi, Helena, Attanasio, Giuseppe, Nozza, Debora, Guerini, Marco

arXiv.org Artificial Intelligence

Recent computational approaches for combating online hate speech involve the automatic generation of counter narratives by adapting Pretrained Transformer-based Language Models (PLMs) with human-curated data. This process, however, can lead to in-domain overfitting, resulting in models generating acceptable narratives only for hatred similar to the training data, with little portability to other targets or to real-world toxic language. This paper introduces novel attention regularization methodologies to improve the generalization capabilities of PLMs for counter narrative generation. Overfitting to training-specific terms is then discouraged, resulting in more diverse and richer narratives. We experiment with two attention-based regularization techniques on a benchmark English dataset. Regularized models produce better counter narratives than state-of-the-art approaches in most cases, both in terms of automatic metrics and human evaluation, especially when hateful targets are not present in the training data. This work paves the way for better and more flexible counter-speech generation models, a task for which datasets are highly challenging to produce.